Virtue: Performance Visualization of Parallel and Distributed Applications
Abstract
High-speed, wide-area networks have made it both possible and desirable to interconnect geographically distributed applications that control distributed collections of scientific data, remote scientific instruments, and high-performance computer systems. Such an application might, for example, control a remote radio telescope, transmit raw data from the telescope site to a distributed data archive, and concurrently convolve the data to create images for real-time visualization. Developing just such a distributed application infrastructure is the goal of our partners in the National Computational Science Alliance, one of the NSF Partnerships for an Advanced Computational Infrastructure.
Although interconnecting these applications enables geographically distributed science and engineering teams to collaborate in new ways, the resultant distributed computations pose significant performance analysis and optimization challenges.
First, the execution environments of geographically distributed applications are far less deterministic than those of locally distributed, parallel applications [1]. Network bandwidths and latencies, computing resources, and available data repositories can vary from one execution to another, and even during a single execution. Consequently, identifying and correcting performance bottlenecks exposed during one execution may not benefit later executions.
Second, distributed applications are highly complex. Application components execute atop disparate system software and hardware. Real-time instruments often impose scheduling and access constraints. Accessing data repositories sometimes necessitates data translation for correlation with experimental or computational data.
Finally, enabling effective remote interaction and visualization necessitates quality-of-service (QoS) guarantees. Incorporating QoS into these hardware and software systems further increases their complexity.
Historically, performance analysis has focused on monolithic applications executing on large, standalone, parallel systems. In such a domain, measurement, postmortem analysis, and code optimization suffice to eliminate performance bottlenecks and optimize applications. Most existing performance analysis systems, for example SvPablo [2], Medea [3], and Paragraph [4], use only postmortem analysis. To tune the emerging distributed applications, however, a new generation of online performance measurement and optimization tools must adapt application behavior dynamically as resource availability changes.
In addition to providing real-time adaptive control, new performance tools must gather data from multiple sources and software levels (application, library, system, and network). Furthermore, these tools must enable geographically dispersed teams to collaborate in identifying and correcting performance problems. This capability requires support for distributed visualization and control, as well as for both synchronous and asynchronous collaboration.
The Virtue prototype exploits human sensory capabilities to help performance analysts explore and optimize large-scale, multidisciplinary applications. The visualization environment lets collaborators interact with executing software, tuning its behavior to meet performance goals.
Similar resources
Flexible performance visualization of parallel and distributed applications
Performance debugging of parallel and distributed applications can benefit from behavioral visualization tools helping to capture the dynamics of the executions of applications. The Pajé generic tool presented in this article provides interactive and scalable behavioral visualizations; because of its genericity, it can be used unchanged in a large variety of contexts. © 2002 Elsevier Science B....
Visualization of Parallel Execution Graphs
Measuring and evaluating the runtime of parallel programs is a difficult task. In this paper we present tools for performance evaluation and visualization in the distributed thread system (DTS), a programming environment for portable parallel applications. We describe the visualization of a parallel trace log as an execution graph using a novel layout algorithm which has been tailored to expose t...
A Parallel Debugger with Support for Distributed Arrays, Multiple Executables and Dynamic Processes
In this paper we present the parallel debugger DETOP with special emphasis on new support for debugging of programs with distributed data structures such as arrays that have been partitioned over a number of processors. The new array visualizer within DETOP supports transparent browsing and visualization of distributed arrays which occur in languages such as High Performance Fortran. Visualizat...
A Steering and Visualization Toolkit for Distributed Applications
Parallel and high performance computing has enabled great strides to be made in advancing science and solving large problems. However, this progress is limited by the lack of needed tools and the difficulty of programming and running parallel applications. Specifically, there is a lack of needed steering and visualization tools, which can be easily integrated in existing applications. This pape...
Journal title:
- IEEE Computer
Volume 32, issue
Pages -
Publication date: 1999